Policy search with high-dimensional context variables
Direct contextual policy search methods learn to improve policy
parameters and simultaneously generalize these parameters
to different context or task variables. However, learning
from high-dimensional context variables, such as camera images,
is still a prominent problem in many real-world tasks.
A naive application of unsupervised dimensionality reduction methods, such as
principal component analysis, to the context variables is insufficient, as
task-relevant input may be ignored.
In this paper, we propose a contextual policy search method in
the model-based relative entropy stochastic search framework
with integrated dimensionality reduction. We learn a model of
the reward that is locally quadratic in both the policy parameters
and the context variables. Furthermore, we perform supervised
linear dimensionality reduction on the context variables
by nuclear norm regularization. The experimental results
show that the proposed method outperforms naive dimensionality
reduction via principal component analysis and
a state-of-the-art contextual policy search method.
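The abstract does not spell out the optimizer, but nuclear-norm-regularized regression has a standard proximal-gradient form. The sketch below assumes a purely bilinear reward model r ≈ θᵀKs (the paper's model is locally quadratic in both the policy parameters θ and the context s), so the function names, step size, and penalty weight lam are illustrative assumptions, not the authors' implementation; a low-rank K then acts as a learned linear dimensionality reduction of the context.

```python
import numpy as np

def prox_nuclear(K, tau):
    """Proximal operator of tau * ||K||_*: soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def fit_bilinear_reward(Theta, S, r, lam=0.1, lr=1e-3, n_iters=1000):
    """Fit r_i ~ theta_i^T K s_i with a nuclear-norm penalty on K by
    proximal gradient descent (hypothetical setup, not the paper's full model)."""
    n = Theta.shape[0]
    K = np.zeros((Theta.shape[1], S.shape[1]))
    for _ in range(n_iters):
        resid = np.einsum("ij,jk,ik->i", Theta, K, S) - r  # predictions minus targets
        grad = Theta.T @ (resid[:, None] * S) / n          # gradient of 0.5 * mean squared error
        K = prox_nuclear(K - lr * grad, lr * lam)          # gradient step, then prox
    return K
```

The singular directions of the fitted K that survive soft-thresholding span the reduced context subspace, which is the sense in which the regularizer performs supervised linear dimensionality reduction.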
TD-regularized actor-critic methods
Actor-critic methods can achieve incredible performance on difficult
reinforcement learning problems, but they are also prone to instability. This
is partly due to the interaction between the actor and critic during learning,
e.g., an inaccurate step taken by one of them might adversely affect the other
and destabilize the learning. To avoid such issues, we propose to regularize
the learning objective of the actor by penalizing the temporal difference (TD)
error of the critic. This improves stability by avoiding large steps in the
actor update whenever the critic is highly inaccurate. The resulting method,
which we call the TD-regularized actor-critic method, is a simple plug-and-play
approach to improving the stability and overall performance of actor-critic
methods. Evaluations on standard benchmarks confirm these improvements.
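The abstract leaves the update rule implicit, but the idea admits a compact surrogate loss. The sketch below assumes a standard score-function policy gradient over sampled log-probabilities; the function name, the weight eta, and the exact penalty form (a REINFORCE-style surrogate for the expected squared TD error) are assumptions for illustration, not the authors' code.

```python
import torch

def td_regularized_actor_loss(logp, advantage, td_error, eta=0.1):
    """TD-regularized policy-gradient surrogate (minimal sketch).

    logp      : log pi(a|s) of sampled actions, differentiable w.r.t. the actor
    advantage : advantage estimates (treated as constants)
    td_error  : critic TD errors delta = r + gamma * V(s') - V(s) (constants)
    eta       : penalty weight (illustrative default, an assumption here)
    """
    pg_term = -(logp * advantage.detach()).mean()
    # Score-function surrogate for E_pi[delta^2]: penalizes actions where
    # the critic's TD error is large, damping the actor step there.
    td_penalty = (logp * td_error.detach().pow(2)).mean()
    return pg_term + eta * td_penalty
```

In a full training loop this loss would simply replace the usual policy-gradient loss while the critic is trained on the same TD errors, which is what makes the scheme plug-and-play.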